Abstract

The use of statistical potentials in NMR structure calculation improves the accuracy of the final structure but also raises 
issues of double counting and possible bias. Because statistical potentials are averaged over a large set of structures, they 
may not reflect the preferences of a particular structure or data set. We propose a Bayesian method to incorporate a 
knowledge-based backbone dihedral angle potential into an NMR structure calculation. To avoid bias exerted through the 
backbone potential, we adjust its weight by inferring it from the experimental data. We demonstrate that an optimally 
weighted potential leads to an improvement in the accuracy and quality of the final structure, especially with sparse and 
noisy data. Our findings suggest that no universally optimal weight exists, and that the weight should be determined based 
on the experimental data. Other knowledge-based potentials can be incorporated using the same approach. 
Introduction

Structural data measured by NMR spectroscopy are never 
complete. Even the most carefully collected data will by themselves 
not allow us to determine the three-dimensional structure of a 
biomolecule with atomic resolution. Rather, we need to interpret 
the data in the light of prior knowledge that is typically encoded in 
a potential function or force field [1]. 

Potential functions quantify the forces and interactions within a 
biomolecule and with its environment. Two fundamentally 
different approaches to designing potential functions are commonly
used [2]. Physics-based force fields [3] aim to approximate
the underlying physical laws. Statistical or knowledge-based 
potentials [4] are learned from a structure database and describe 
the effective forces resulting from all interactions including those 
with the solvent. Physical and statistical potentials are comple- 
mentary in the sense that some interactions cannot be broken 
down easily into fundamental, physical contributions but are 
captured more effectively by potentials derived from known 
structures. 

In NMR structure calculation, potential functions are used to 
guide the calculation towards structures of high quality and 
accuracy [1]. This guidance is needed because NMR measure- 
ments by themselves do not allow us to determine the three- 
dimensional structure of an entire macromolecule. To "let the 
data speak for themselves" and also for reasons of computational 
efficiency, one tends to use minimalist force fields that ignore 
complex effects such as electrostatic screening or solvent interac- 
tions. If additional potentials such as dihedral angle [5,6] or 
hydrogen bonding potentials [7] are used, their force constants are 
set ad hoc and held fixed during the structure calculation. 



However, it might be necessary to adjust the force constants for 
each data set. Because knowledge-based potentials represent 
averages over large sets of structures, they are not universally 
transferable and may not represent the preferences of a particular 
structure. Think of a backbone dihedral angle potential, a 
"Ramachandran potential", as an example. The minimal energy 
configuration of the Ramachandran potential is completely alpha- 
helical. Therefore it seems more appropriate to choose a higher 
weight for helical proteins than for all-beta proteins. 

Here we introduce an objective, data-driven approach to find 
the optimal force constant for a given protein and data set. Our 
method is based on statistical mechanics and Bayesian inference 
and allows us to incorporate knowledge-based potential functions 
without biasing the structure calculation. 

Results 

Statistical potential for backbone dihedral angles 

Protein backbone dihedral angles $\varphi$ and $\psi$ show a typical
correlation pattern, an observation made by Ramachandran et al.
[8] assuming only hard-sphere steric repulsion between atoms.
Standard nonbonded energies used in NMR structure determination
[9] do not fully capture all aspects of the distributions
observed in high-resolution crystal structures [10]. The dihedral
angle distributions obtained from nonbonded interactions do not
reproduce the empirical distribution (see the bottom row of
Figure 1 and Figures S2-S4 in File S1). Depending on the quality
of the data, NMR structures can show dihedral angles outside the 
allowed regions of the Ramachandran plot [11,12]. For this reason 
it is common practice [13] to assess the Ramachandran statistics of 
NMR structures by programs such as Procheck [14], MolProbity 
[15] and WhatCheck [16]. To obtain more regular NMR
structures, various dihedral angle potentials derived from database 
statistics have been developed [5,6]. The functional forms of these 
potentials range from two-dimensional histograms [17] to 
continuous representations based on linear interpolation, cubic 
splines and statistical density estimation [18-20]. Some of these
models ignore that $\varphi/\psi$ distributions are smooth and periodic,
which can result in artifacts in the refinement [21].

We use nonparametric density estimation to derive a backbone
potential and expand the joint $\varphi/\psi$ distribution into a Fourier
series [22]. This representation is inherently smooth and periodic
and has the advantage that it can easily represent multimodal
distributions. Each distribution is a linear combination of 80
two-dimensional cosine and sine functions resulting from the
combination of five frequencies in the $\varphi$ and $\psi$ dimensions. The
estimated distributions capture features such as the alpha-helical
peak and regions corresponding to parallel and anti-parallel beta
sheets. Also rare secondary structures such as left-handed helices
are represented accurately. Figure 1 shows the estimated dihedral
distribution of three representative amino acids and the corresponding
empirical histograms (see Figure S2 in File S1 for the full
set of dihedral angle distributions).
Figure 1. Backbone dihedral angle distributions of amino acids estimated from high-resolution crystal structures. Density maps of
$\varphi/\psi$ distributions for Alanine (left column), Proline (middle column) and Glycine (right column) as approximated by the maximum entropy
distribution (top row). The middle row shows the empirical $\varphi/\psi$ distribution computed over a non-redundant structure database. The bottom row
shows the $\varphi/\psi$ distribution obtained by sampling structures from the nonbonded force field.
doi:10.1371/journal.pone.0100197.g001
Data-driven weighting of the backbone potential

We use the probabilistic inferential structure determination
approach (ISD) [23,24] to determine protein structures from
experimental data. In a standard ISD calculation, one explores the
posterior probability $\Pr(\theta \mid D) \propto \Pr(D \mid \theta)\,\Pr(\theta)$ of all conformational
degrees of freedom $\theta$ given the experimental data $D$. The
posterior distribution itself is proportional to the likelihood function
$\Pr(D \mid \theta) \propto \exp\{-w_{\mathrm{data}} E_{\mathrm{data}}(\theta)\}$ and the prior probability $\Pr(\theta)$.
The likelihood is the probability of the data given the structure and
involves $E_{\mathrm{data}}(\theta)$, a measure of the goodness-of-fit between the data
and a particular structure $\theta$; the weight $w_{\mathrm{data}}$ allows us to balance
the data against the prior probability. The prior distribution is
typically a Boltzmann distribution at inverse temperature
$w_{\mathrm{phys}} = 1/k_B T$, $\Pr(\theta) \propto \exp\{-w_{\mathrm{phys}} E_{\mathrm{phys}}(\theta)\}$, resulting from
the force field $E_{\mathrm{phys}}$. We incorporate the newly derived backbone
potential $E_{\mathrm{rama}}(\theta)$ by extending the prior distribution:
$\Pr(\theta \mid w_{\mathrm{rama}}) \propto \exp\{-w_{\mathrm{phys}} E_{\mathrm{phys}}(\theta) - w_{\mathrm{rama}} E_{\mathrm{rama}}(\theta)\}$, where $w_{\mathrm{rama}}$
is the weight of the backbone potential.

The weight of the backbone potential $w_{\mathrm{rama}}$ is unknown and has
to be chosen. Naively, we would set it to one ($w_{\mathrm{rama}} = 1$).
But this is problematic because some aspects of the Ramachandran
plot are already captured by the force field. Figures 1 and S3
in File S1 show that structures calculated on the basis of the
nonbonded force field $E_{\mathrm{phys}}$ already reproduce the rough outline
of the $\varphi/\psi$ basins. But there are more subtle aspects such as
optimal hydrogen bonding geometry [25] that result in pronounced
peaks, which are not reproduced by the force field alone.
As a consequence, the force field and the backbone potential are
not independent of each other but are positively or negatively
correlated depending on the energy range (Figure 2A). Setting the
$\varphi/\psi$ weight to a large value risks overemphasizing these
contributions in the combined potential. In the limiting case, we
will force the structure into a helical conformation, whereas with
too small a $w_{\mathrm{rama}}$ the effect of the backbone potential becomes
negligible. Therefore, we need to adjust $w_{\mathrm{rama}}$ according to the
experimental data and the structure.

We have introduced a Bayesian approach to estimate the weight
of the experimental data $w_{\mathrm{data}}$ relative to the prior probability
[26]. This approach exploits the fact that for every conformation
we can calculate how well it agrees with the data and that its
goodness-of-fit determines the weight of the data $w_{\mathrm{data}}$. The same
is not possible for $w_{\mathrm{rama}}$, because the statistics reflected by the
backbone potential recapitulate an ensemble property, and
therefore we need to assess how well the entire ensemble agrees
with the Ramachandran statistics. Thus it is computationally
much more demanding to adjust $w_{\mathrm{rama}}$ than $w_{\mathrm{data}}$.

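To estimate $w_{\mathrm{rama}}$, we compare the expected backbone energy
$\langle E_{\mathrm{rama}} \rangle$ obtained without and with the experimental data, where
$\langle \cdot \rangle$ denotes an ensemble average. $\langle E_{\mathrm{rama}} \rangle_{\mathrm{no\,data}}$ summarizes how the force field
and the backbone potential are correlated independent of any
data. To obtain this ensemble average, structures are sampled
based on the combined energy $w_{\mathrm{phys}} E_{\mathrm{phys}} + w_{\mathrm{rama}} E_{\mathrm{rama}}$. This value
is contrasted with the expected backbone energy obtained with
data, $\langle E_{\mathrm{rama}} \rangle_{\mathrm{data}}$. To calculate this ensemble average, structures are
sampled based on the full energy $w_{\mathrm{data}} E_{\mathrm{data}} + w_{\mathrm{phys}} E_{\mathrm{phys}} + w_{\mathrm{rama}} E_{\mathrm{rama}}$.
If $\langle E_{\mathrm{rama}} \rangle_{\mathrm{data}} < \langle E_{\mathrm{rama}} \rangle_{\mathrm{no\,data}}$, the addition of the
experimental data improves the backbone energy, and we can
increase $w_{\mathrm{rama}}$ because the data comply with the Ramachandran
statistics. If the data contradict the backbone potential,
$\langle E_{\mathrm{rama}} \rangle_{\mathrm{data}} > \langle E_{\mathrm{rama}} \rangle_{\mathrm{no\,data}}$, we must lower the weight because
the backbone potential biases the ensemble. The optimal weight
achieves

$$\langle E_{\mathrm{rama}} \rangle_{\mathrm{data}} = \langle E_{\mathrm{rama}} \rangle_{\mathrm{no\,data}}. \qquad (1)$$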

This rule can be derived rigorously [27] by maximization of the
model evidence, $\Pr(D \mid w_{\mathrm{rama}})$, which is the probability of observing
the data for a particular value of $w_{\mathrm{rama}}$ and whose computation
involves an ensemble average. It can be shown that the derivative
of the logarithm of the model evidence with respect to $w_{\mathrm{rama}}$ is the
difference between the average Ramachandran potential under the
prior and the posterior (i.e. without and with data) [27]. At the
maximum evidence the derivative vanishes, therefore both energies
are the same and the curves cross. If we choose the weight from this
region, we bias our ensembles the least.
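
The rule in equation (1) can be made plausible with a short calculation. The following is a sketch rather than the derivation given in [27]: it assumes that the likelihood does not depend on $w_{\mathrm{rama}}$, it suppresses all additional model parameters, and the partition functions $Z_{\mathrm{data}}$ and $Z_{\mathrm{no\,data}}$ are notation introduced here for illustration only.

```latex
\begin{align}
\Pr(D \mid w_{\mathrm{rama}})
  &= \int \Pr(D \mid \theta)\,\Pr(\theta \mid w_{\mathrm{rama}})\,d\theta
   = \frac{Z_{\mathrm{data}}(w_{\mathrm{rama}})}{Z_{\mathrm{no\,data}}(w_{\mathrm{rama}})}, \\
Z_{\mathrm{no\,data}}(w_{\mathrm{rama}})
  &= \int \exp\{-w_{\mathrm{phys}} E_{\mathrm{phys}}(\theta) - w_{\mathrm{rama}} E_{\mathrm{rama}}(\theta)\}\,d\theta, \\
Z_{\mathrm{data}}(w_{\mathrm{rama}})
  &= \int \Pr(D \mid \theta)\,
     \exp\{-w_{\mathrm{phys}} E_{\mathrm{phys}}(\theta) - w_{\mathrm{rama}} E_{\mathrm{rama}}(\theta)\}\,d\theta, \\
\frac{\partial}{\partial w_{\mathrm{rama}}} \log Z_{\mathrm{no\,data}}
  &= -\langle E_{\mathrm{rama}} \rangle_{\mathrm{no\,data}}, \qquad
\frac{\partial}{\partial w_{\mathrm{rama}}} \log Z_{\mathrm{data}}
   = -\langle E_{\mathrm{rama}} \rangle_{\mathrm{data}}, \\
\frac{\partial}{\partial w_{\mathrm{rama}}} \log \Pr(D \mid w_{\mathrm{rama}})
  &= \langle E_{\mathrm{rama}} \rangle_{\mathrm{no\,data}} - \langle E_{\mathrm{rama}} \rangle_{\mathrm{data}}.
\end{align}
```

Under these assumptions the derivative is positive when the data lower the expected backbone energy, so the evidence grows with the weight, and it vanishes exactly when equation (1) holds.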

Bayesian weighting with high-quality data 

We used Bayesian weighting to analyze the high-quality data for
ubiquitin (PDB code: 1d3z). We estimated the optimal weight from
the NOE-based distances and used the additional scalar and
dipolar data for validation. Figure 2B shows the model evidence
$\Pr(D \mid w_{\mathrm{rama}})$, which peaks at the optimal weight satisfying
equation (1). Assuming a uniform prior probability for $w_{\mathrm{rama}}$,
the estimated weight of the backbone potential is
$w_{\mathrm{rama}} = 0.94 \pm 0.09$. An additional control for evaluating the
Bayesian choice of the Ramachandran weight is provided by the residual
dipolar couplings (RDCs) and scalar coupling measurements that
are available for ubiquitin. For each of the 11 RDC sets, we
calculated the average Q-factor obtained for different choices of
$w_{\mathrm{rama}}$. Figure 2C shows that the Bayesian choice of the
Ramachandran weight improves the Q-factor to
$Q = 0.19 \pm 0.04$. Similarly, we see an improvement in the fit with
the scalar coupling measurements (Fig. 2D).
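
The text does not spell out how the Q-factor is computed. A common definition divides the root-mean-square deviation between observed and back-calculated couplings by the root-mean-square of the observed couplings. The sketch below assumes that definition, which may differ from the normalization used for Figure 2C; the function name `rdc_q_factor` and the coupling values are hypothetical.

```python
import numpy as np

def rdc_q_factor(d_obs, d_calc):
    """Q-factor under the assumed definition: rms(d_obs - d_calc) / rms(d_obs)."""
    d_obs = np.asarray(d_obs, dtype=float)
    d_calc = np.asarray(d_calc, dtype=float)
    return float(np.sqrt(np.sum((d_obs - d_calc) ** 2) / np.sum(d_obs ** 2)))

# Made-up couplings in Hz, for illustration only.
d_obs = np.array([4.2, -7.5, 10.1, -2.3])
d_calc = np.array([3.9, -6.8, 9.0, -2.9])
q = rdc_q_factor(d_obs, d_calc)
```

With this definition a perfect back-calculation gives Q = 0, and predicting zero for every coupling gives Q = 1, so smaller values indicate better agreement.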

Bayesian weighting with incomplete data 

Next, we studied how the weight changes for sparse versions of
the ubiquitin data. To do so, we introduce a completeness
parameter $\lambda$, which controls the amount of data [28]. For $\lambda < 1$,
the effective number of observations is reduced. As observed
before [27], the model evidence broadens and shifts towards
smaller values if we reduce the number of observations (Fig. 3).
The estimated weights are: $w_{\mathrm{rama}} = 0.94 \pm 0.09$ ($\lambda = 1.0$),
$w_{\mathrm{rama}} = 1.00 \pm 0.09$ ($\lambda = 0.1$), $w_{\mathrm{rama}} = 0.92 \pm 0.1$ ($\lambda = 0.01$), and
$w_{\mathrm{rama}} = 0.84 \pm 0.15$ ($\lambda = 0.005$). If we reduce the amount of data
further, the system undergoes a phase transition because the
posterior no longer peaks at the native ensemble, and $w_{\mathrm{rama}}$ is
pushed toward zero. For all values of $\lambda$, the optimal weight falls in
the range that results in a minimal RMSD to the crystal structure.
The Bayesian choice of $w_{\mathrm{rama}}$ generates the most accurate
structures.

Impact on structure ensembles from sparse and noisy 
NMR data 

So far, we studied how Bayesian weighting of the backbone
potential impacts the conformational ensemble under artificially
sparsified data. We also looked at challenging real-world structure
determination problems: a sparse set of distances for the Fyn-SH3
domain [23,29] and noisy distance bounds measured with solid-state
NMR on the α-spectrin SH3 domain [30]. We estimated the
Ramachandran weight for both data sets and obtained
$w_{\mathrm{rama}} = 0.76 \pm 0.11$ and $w_{\mathrm{rama}} = 0.3 \pm 0.13$ for the sparse and
noisy distances, respectively (see also Figure S7 in File S1).

Incorporation of the backbone potential can significantly
improve the accuracy of the ensemble as measured by the RMSD
Figure 2. Bayesian weighting of the backbone potential for ubiquitin inferred from high-quality distance data. A: Correlation between
backbone potential and nonbonded force field. Shown is the joint distribution of physics- and knowledge-based contributions in the absence of any
structural data. (The energies of the crystal structure are $E_{\mathrm{phys}} = -455.5$ and $E_{\mathrm{rama}} = 187.3$.) B: Model evidence $\Pr(D \mid w_{\mathrm{rama}})$ as a function of the
Ramachandran weight $w_{\mathrm{rama}}$. C: Influence of the Ramachandran weight on the average Q-factor (red dashed line) calculated for 11 RDC data sets that
were not used in the structure calculation. The Q-factor reflects the agreement between experimental and calculated RDCs. The dotted black line
indicates the average Q-factor of the crystal structure (PDB code: 1ubq). D: Influence of the Ramachandran weight on the fit with scalar coupling
measurements (red dashed line). Six three-bond scalar coupling data sets are available for ubiquitin and have not been used in the structure
calculation. The dotted black line indicates the average Q-factor of the crystal structure (PDB code: 1ubq). The grey distribution indicates the model
evidence $\Pr(D \mid w_{\mathrm{rama}})$.
doi:10.1371/journal.pone.0100197.g002



to the crystal structure (Figure 4). But we also observe that an
overly strong backbone potential can do more harm than good.
For large $w_{\mathrm{rama}}$ values, the RMSD distribution deteriorates and
even shows multiple peaks in the case of the sparse data set. With an
optimally weighted backbone potential, the accuracy of the
mean structure is also consistently higher than the accuracy of the
individual members of the structure ensemble (see Table S1 in File
S1). This indicates that the structure ensembles are better defined
when using the backbone potential. Figure 4 shows that the model
Figure 3. Impact of incomplete ubiquitin data on $w_{\mathrm{rama}}$. Shown is the model evidence as a function of $w_{\mathrm{rama}}$ (grey) and the average RMSD
(dots). The sparsity increases from the top left panel to the bottom right panel.
doi:10.1371/journal.pone.0100197.g003



evidence peaks where the average RMSD to the crystal structure 
shows a minimum. 

Figure 5 shows the structure ensembles obtained with the sparse
SH3 data for different choices of $w_{\mathrm{rama}}$. If the weight is zero or too
small, the ensemble is still quite heterogeneous, especially in the
loops. When incorporating the backbone potential with an optimal
weight, the ensemble becomes very regular and accurate: the
average structure is surprisingly close to the reference structure
(1.05 Å RMSD) given the sparseness of the data set. For too large
a weight, we introduce conflicts between the preferences of the
statistical potential and the data by introducing helical structure in
beta strands. The corresponding Ramachandran plots illustrate
these findings. For $w_{\mathrm{rama}} = 5$, the Ramachandran plot becomes
artificially narrow and peaks in the helical region. Another
indication that the Bayesian choice of $w_{\mathrm{rama}}$ is optimal is provided
by the behavior of additional model parameters. Figure S8 in File
S1 shows that the estimated weight of the data $w_{\mathrm{data}}$ [26] is largely
unaffected if we incorporate the knowledge-based contribution.
That is, we do not compromise the data by downweighting them
upon integration of the backbone potential.

In Figure 6 we plot the effective potential function (i.e. the 
negative log-posterior probability) incorporating the force field, the 
backbone potential and the goodness-of-fit for different values of 
$w_{\mathrm{rama}}$ corresponding to the ensembles shown in Figure 5. The
results indicate that optimal weighting helps to guide the 
simulation towards more accurate structures. Without the 
Figure 4. Impact on structure ensembles from sparse and noisy NMR data. Panels A, C show the results for the sparse Fyn-SH3 data set.
Panels B, D show the results for the solid-state data. The top row displays the RMSD distributions with $w_{\mathrm{rama}} = 0$ (white), $w_{\mathrm{rama}} = 5$ (black) and optimal
$w_{\mathrm{rama}}$ (grey). The grey distribution shown in the bottom panels is the model evidence as a function of the weight $w_{\mathrm{rama}}$.
doi:10.1371/journal.pone.0100197.g004



backbone potential, structures at the bottom of the energy funnel
show a broad range of RMSDs between 1.5 and 2.5 Å. For
optimal $w_{\mathrm{rama}}$, the funnel narrows and selects structures with an
RMSD below 1.5 Å. With too large a $w_{\mathrm{rama}}$ we observe a negative
correlation between the RMSD and the negative log-posterior
probability.

Impact on structure quality 

Figure 7 shows the average values of several validation criteria
for structures generated at different weights (a full report of the
Procheck and WhatCheck quality criteria can be found in Table
S1 in File S1). All reported criteria are z-scores that provide an
assessment of how a particular structure compares to the average
of all known protein structures in terms of standard deviations.
That is, a z-score of $-0.5$, say, means that the quality criterion of
this particular structure is half a standard deviation below the
average.

As expected, the Ramachandran score (RAMCHK) improves
with increasing $w_{\mathrm{rama}}$ and shows no saturation, which demonstrates
that it is not a valid indicator for selecting $w_{\mathrm{rama}}$. The effect
of the backbone potential on the NQACHK score is small; only
for ubiquitin does it agree with the Bayesian choice of $w_{\mathrm{rama}}$.
WhatCheck's packing score QUACHK shows a different behavior:
it increases steadily with increasing $w_{\mathrm{rama}}$. BBCCHK assesses
the regularity of the backbone and shows a steady increase with
increasing $w_{\mathrm{rama}}$, indicating a high correlation with the
Figure 5. Influence of the weight $w_{\mathrm{rama}}$ on the structural ensemble of Fyn-SH3 inferred with sparse NMR data. Shown are the
conformations and backbone dihedral distributions generated with different $w_{\mathrm{rama}}$. Panels A-C display structure ensembles comprising ten randomly
selected conformations (grey) superimposed onto the crystal structure (red). Panels D-F show in black a maximum entropy distribution fitted to the
backbone torsion angles of the structures generated with ISD. The backbone dihedral angles of the crystal structure are marked by red dots. Panels A,
D show the results for $w_{\mathrm{rama}} = 0.0$, panels B, E: $w_{\mathrm{rama}} = 0.76$ (optimal weight), panels C, F: $w_{\mathrm{rama}} = 5.0$ (maximum weight probed during
replica-exchange simulations).
doi:10.1371/journal.pone.0100197.g005
Figure 6. Energy funnels obtained with the sparse Fyn-SH3 data at different Ramachandran weights (left: $w_{\mathrm{rama}} = 0.0$, middle:
$w_{\mathrm{rama}} = 0.76$, right: $w_{\mathrm{rama}} = 5.0$). The full ISD energy (negative log-posterior probability) is plotted against the RMSD to the Fyn-SH3 crystal structure.
doi:10.1371/journal.pone.0100197.g006
Figure 7. Influence of the Ramachandran weight on various quality criteria. Shown is the impact of $w_{\mathrm{rama}}$ on WhatCheck validation criteria.
Each column reports the results for a different data set (left column: ubiquitin, middle column: Fyn-SH3 domain, right column: α-spectrin SH3
domain). Each row shows the evolution of a quality score with increasing $w_{\mathrm{rama}}$ (each dot marks the average over 100 structures that were randomly
selected from the ISD ensemble, dashed lines are added to guide the eye). The first row reports the Ramachandran appearance as assessed by
RAMCHK. The second and third row show WhatCheck's packing scores. The last row reports the regularity of the backbone (BBCCHK). The grey
distribution indicates the model evidence $\Pr(D \mid w_{\mathrm{rama}})$ as a function of $w_{\mathrm{rama}}$.
doi:10.1371/journal.pone.0100197.g007



Ramachandran potential. Although the ubiquitin data are of
high quality, the structure ensemble still improves upon incorporation
of the backbone potential. As for ubiquitin, we observe an
overall improvement in the quality of the ensembles obtained with
the sparse and noisy SH3 data. The slight decrease of the average
NQACHK score for the optimal weight ensemble in the case of
α-spectrin is within the ensemble spread; again the score is
only weakly affected by the Ramachandran weight (see Figure S5
in File S1).

The WhatCheck validation criteria by themselves are
unable to choose a $w_{\mathrm{rama}}$ that would lead to a global improvement
of the structural quality as well as the accuracy of the ensemble.
Moreover, although some of the scores seem to be highly
correlated (e.g. RAMCHK and BBCCHK, see Figure S6 in File
S1), it is not clear whether it is possible to maximize all scores
simultaneously. Rather, we have to find a compromise between the
different quality criteria, and this is exactly what our weighting
scheme achieves.



Discussion 

We outline a new formalism to integrate physics- with 
knowledge-based potential functions in biomolecular structure 
calculation. Our approach is founded on Bayesian principles and 
allows us to incorporate prior knowledge derived from structure 
databases. The new method is data-driven and adaptively weighs
knowledge-based contributions relative to the force field and the
experimental data, thereby reducing potential bias and artifacts. We
show that even simple terms such as knowledge-based backbone 
potentials have a significant effect on the quality of the structure 
ensemble. The optimally weighted dihedral angle potential 
improves not only the Ramachandran appearance but also the 
backbone normality and packing scores. Moreover, it systemati- 
cally produces more accurate and more precise structure 
ensembles. The combination of physics- and knowledge-based 
potential functions is particularly powerful for sparse and noisy 
NMR data and shifts the ensemble closer to the native structure. 
Our findings suggest that there is no universal weight that can be
transferred to all proteins and data sets. Rather, the method of 
choice is to estimate the weight in the course of the structure 
calculation. 

For the solid-state SH3 data, we find the smallest weight
$w_{\mathrm{rama}} \approx 0.3$; for the ubiquitin data we estimate the largest weight
$w_{\mathrm{rama}} \approx 1$. We can rationalize this behavior as follows. For
high-quality data (high completeness, low noise level), the optimal
weight adopts a large value and drops as noise and sparseness 
increase [27]. This behavior seems counterintuitive at first sight 
but is sensible: The forces that pull the ensemble towards the 
correct structure are weaker with low-quality than with high- 
quality data. We therefore have to soften the backbone potential to 
not overwhelm the data. 

In the future, we plan to extend our method to weigh multiple 
statistical energy terms simultaneously in the course of a structure 
calculation. However, this will require a more efficient algorithm 
for estimating multi-dimensional densities of states, because the 
computational complexity of the approach becomes prohibitive. 
The naive extension of the presented approach would involve a 
multi-dimensional replica-exchange simulation in which replicas 
are introduced for every combination of the weights that we want 
to estimate. Therefore the computational burden grows expo- 
nentially in the number of weights such that with our current 
algorithm it is only possible to estimate up to two or three 
weights. The final goal is to design an efficient, unbiased but 
highly informative conformational prior distribution that allows 
the calculation of high quality ensembles from very sparse data 
sets. 

Materials and Methods 

Data sets 

Backbone dihedral angles were extracted from PDBselect25
[31] and used to estimate angular distributions for all amino acids
using the maximum entropy method (see next section). We
illustrate the impact of the backbone potential on three NMR
data sets. The first data set (PDB code: 1d3z) comprises high-quality
data for ubiquitin. All distance data were reduced to 1444
non-redundant restraints; additional data (scalar coupling constants,
residual dipolar couplings) were not included in the
structure calculation but used for validation. The second data set
comprises sparse distance data for the Fyn-SH3 domain [23,29]
(PDB code: 1zbj). The third data set has been measured with
solid-state NMR on the α-spectrin SH3 domain [30] (PDB code:
1m8m). The solid-state data are very generous distance bounds
ranging from 4.5 Å to 7.5 Å, of which 90% are equal to or
greater than 6 Å, which is the largest distance bound obtained in
standard solution NMR.

Maximum entropy distributions for backbone dihedral 
angles 

Following Pertsemlidis et al. [22], we use a maximum entropy
distribution with a Fourier basis to describe the distribution of
backbone dihedral angles:

$$p(\varphi, \psi) = \frac{1}{Z(a,b,c,d)} \exp\{-E_{\mathrm{rama}}(\varphi, \psi)\} \qquad (2)$$

where the Ramachandran potential $E_{\mathrm{rama}}(\varphi, \psi)$ is given by

$$E_{\mathrm{rama}}(\varphi, \psi) = \sum_{i,j=1}^{k} \left[ a_{ij} \cos(i\varphi) \cos(j\psi) + b_{ij} \cos(i\varphi) \sin(j\psi) + c_{ij} \sin(i\varphi) \cos(j\psi) + d_{ij} \sin(i\varphi) \sin(j\psi) \right], \qquad (3)$$

$Z(a,b,c,d)$ normalizes the dihedral angle distribution, and $k = 5$ is
the order of the Fourier expansion (see Figure S1 in File S1). We fit
the expansion coefficients $a, b, c, d$ to observed $\varphi/\psi$ pairs by using
the maximum entropy method, which, in this case, is equivalent to
maximum likelihood parameter estimation. We approximate the
normalization constant $Z(a,b,c,d)$ using the two-dimensional
trapezoidal rule. To avoid over-fitting of the $\varphi/\psi$ distributions,
we introduce a Gaussian prior with unknown precision $\lambda$ over the
expansion coefficients:



$$p(a,b,c,d \mid \lambda) = \ldots$$

The precision of the prior $\lambda$ is not known and is estimated
simultaneously with the expansion coefficients. We use an iterative 
scheme in which we cycle through updates of the expansion 
coefficients and of the precision. For fixed precision, the negative 
log-posterior probability of the expansion coefficients is a convex 
function, which we optimize using the Powell minimizer [32]. The 
update of the precision can be calculated analytically. 
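
How equations (2) and (3) and the trapezoidal normalization fit together can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the fitting code used for the paper: the function names are hypothetical, the coefficients are random placeholders rather than fitted values, the frequencies are assumed to run over i, j = 1..k, and the grid size is arbitrary.

```python
import numpy as np

def rama_energy(phi, psi, a, b, c, d):
    """Fourier-series energy E_rama(phi, psi) of equation (3); angles in radians.

    a, b, c, d are (k, k) arrays; entry [i-1, j-1] multiplies the term with
    frequencies (i, j), i, j = 1..k (assumed index range).
    """
    freqs = np.arange(1, a.shape[0] + 1)
    cp = np.cos(np.multiply.outer(phi, freqs))  # cos(i * phi), last axis is i
    sp = np.sin(np.multiply.outer(phi, freqs))
    cq = np.cos(np.multiply.outer(psi, freqs))  # cos(j * psi), last axis is j
    sq = np.sin(np.multiply.outer(psi, freqs))
    return (np.einsum('...i,ij,...j->...', cp, a, cq)
            + np.einsum('...i,ij,...j->...', cp, b, sq)
            + np.einsum('...i,ij,...j->...', sp, c, cq)
            + np.einsum('...i,ij,...j->...', sp, d, sq))

def rama_density(a, b, c, d, n=181):
    """Density of equation (2) on an n x n grid, normalized by the 2D trapezoidal rule."""
    grid = np.linspace(-np.pi, np.pi, n)
    phi, psi = np.meshgrid(grid, grid, indexing='ij')
    unnorm = np.exp(-rama_energy(phi, psi, a, b, c, d))
    h = grid[1] - grid[0]
    w = np.full(n, h)
    w[0] = w[-1] = h / 2.0          # trapezoid weights along one axis
    z = w @ unnorm @ w              # approximates Z(a, b, c, d)
    return grid, unnorm / z, w

# Placeholder coefficients (random, NOT fitted to any structure database).
rng = np.random.default_rng(0)
k = 5
a, b, c, d = (0.3 * rng.standard_normal((k, k)) for _ in range(4))
grid, p, w = rama_density(a, b, c, d)
```

Because every basis function is periodic, the density has the same values at -π and π in both dimensions, which is the property that histogram or spline representations do not guarantee.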

Inferential structure determination 

Inferential structure determination (ISD) [23,24] is a probabilistic
framework for biomolecular structure calculation from
experimental data. ISD uses Bayes's theorem to obtain a posterior
distribution over all unknown parameters including the conformational
degrees of freedom $\theta$ (typically main and side chain
torsions) and all additional model parameters $\alpha$ [33]. The posterior
distribution is proportional to the product of two terms, the
likelihood function and the prior probability distribution. The
likelihood function, $\Pr(D \mid \theta, \alpha)$, is the conditional probability of the
experimental data $D$ viewed as a function of the parameters $\theta$ and
$\alpha$. Here we consider NMR distance measurements, which we
model using the lognormal distribution [34]. This introduces two
model parameters, the distance scale and error, that we estimate
simultaneously with the structure. We use the lognormal model for
the analysis of the 1d3z and 1zbj data, which provide distance
restraints. In the case of the solid-state data (PDB code: 1m8m), only
lower and upper distance bounds are given. We apply a new
probabilistic model (Habeck, in preparation) that estimates a set of
experimental distances falling between the lower and upper
bounds. For given estimated experimental distances, the lognormal
model is used to relate the distance data to the structure.
In addition to the set of unknown experimental distances, this
model has only one model parameter, the distance error.
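
A lognormal error model of this kind can be sketched as follows. The code is a minimal illustration, not the ISD implementation: it assumes that the scale multiplies the back-calculated distances and that a single error parameter applies to all restraints, and the function name and the distance values are hypothetical.

```python
import numpy as np

def lognormal_nll(d_obs, d_calc, scale, sigma):
    """Negative log-likelihood of observed distances under a lognormal model.

    Assumes log(d_obs) is normally distributed around log(scale * d_calc)
    with standard deviation sigma (the "distance error").
    """
    d_obs = np.asarray(d_obs, dtype=float)
    d_calc = np.asarray(d_calc, dtype=float)
    resid = np.log(d_obs) - np.log(scale * d_calc)
    n = d_obs.size
    return float(0.5 * np.sum(resid ** 2) / sigma ** 2
                 + n * np.log(sigma)
                 + 0.5 * n * np.log(2.0 * np.pi)
                 + np.sum(np.log(d_obs)))

# Made-up distances in Å: the "measurements" are off by a scale of 1.1.
d_calc = np.array([2.8, 3.5, 4.9, 5.6])
d_obs = 1.1 * d_calc
nll_right = lognormal_nll(d_obs, d_calc, scale=1.1, sigma=0.2)
nll_wrong = lognormal_nll(d_obs, d_calc, scale=1.0, sigma=0.2)
```

In this toy case the correct scale gives the lower negative log-likelihood, which is what allows scale and error to be estimated together with the structure.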

Optimization of the potential function 

We aim to find an optimal potential function for each data set by
adjusting the influence of the backbone potential $E_{\mathrm{rama}}$. The
combined potential function is given by $w_{\mathrm{phys}} E_{\mathrm{phys}} + w_{\mathrm{rama}} E_{\mathrm{rama}}$,
where $w_{\mathrm{phys}} = 1/k_B T$ is the reciprocal temperature involving
Boltzmann's constant $k_B$ and the absolute temperature $T$. Here
$E_{\mathrm{phys}}$ was set to the Lennard-Jones potential adapted from the
Rosetta software [35]. The only free parameter is the weight of the
backbone potential $w_{\mathrm{rama}}$. The model evidence $\Pr(D \mid w_{\mathrm{rama}})$ can
be interpreted as the probability of the experimental data for a
particular $w_{\mathrm{rama}}$. The optimization of $\Pr(D \mid w_{\mathrm{rama}})$ is demanding
as the calculation depends on an intractable high-dimensional
integral

$$\Pr(D \mid w_{\mathrm{rama}}) = \int \Pr(D \mid \theta, \alpha, w_{\mathrm{rama}})\, \Pr(\theta \mid w_{\mathrm{rama}})\, \Pr(\alpha)\, d\theta\, d\alpha$$

over all model parameters $\alpha$ and conformational degrees
of freedom $\theta$. Here $\Pr(\theta \mid w_{\mathrm{rama}}) \propto \exp\{-w_{\mathrm{phys}} E_{\mathrm{phys}}(\theta) - w_{\mathrm{rama}} E_{\mathrm{rama}}(\theta)\}$
is the combined prior probability of conformation
$\theta$ for a given weight $w_{\mathrm{rama}}$. We can reduce the computation to a
low-dimensional integral by expressing the above equation using
the density of states $g(E_{\mathrm{rama}})$:



$$\Pr(D \mid w_{\mathrm{rama}}) = \ldots$$

The density of states is given as

$$g(E_{\mathrm{rama}}) = \ldots\, \delta(E_{\mathrm{rama}} - \ldots)\, \ldots$$

where $\delta(\cdot)$ denotes the Dirac delta function. Estimates of the
density of states are obtained by applying multiple histogram 
reweighting [36,37] as outlined in [27]. 
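
The idea of reducing the evidence to a one-dimensional sum over backbone energies can be illustrated on a toy system. The sketch below is not the procedure of [27]: it assumes a finite set of states with integer energy levels, it introduces two densities of states (`g_prior` without data and `g_post` with the likelihood folded in) as a convenient bookkeeping device, and it computes them by exact enumeration instead of multiple histogram reweighting. All names and numbers are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n_states = 5000
e_rama = rng.integers(0, 10, size=n_states)     # discrete backbone energy levels 0..9
e_phys = rng.normal(0.0, 1.0, size=n_states)    # toy force-field energies
# Toy likelihood Pr(D | state): the data favour states with e_rama near 3.
lik = np.exp(-0.5 * (e_rama - 3.0) ** 2) * rng.uniform(0.5, 1.5, size=n_states)

boltz = np.exp(-e_phys)                         # force-field factor with w_phys = 1
levels = np.arange(10)
g_prior = np.bincount(e_rama, weights=boltz, minlength=10)        # without data
g_post = np.bincount(e_rama, weights=boltz * lik, minlength=10)   # with data

def log_evidence(w):
    """log Pr(D | w) as a ratio of two one-dimensional sums over energy levels."""
    factor = np.exp(-w * levels)
    return np.log(g_post @ factor) - np.log(g_prior @ factor)

def mean_energy(g, w):
    """Expected backbone energy at weight w for a given density of states."""
    p = g * np.exp(-w * levels)
    return (p @ levels) / p.sum()

weights = np.linspace(0.0, 2.0, 2001)
log_ev = np.array([log_evidence(w) for w in weights])
w_opt = weights[np.argmax(log_ev)]
gap = mean_energy(g_post, w_opt) - mean_energy(g_prior, w_opt)
```

Once the two densities of states are known, the evidence can be evaluated for any weight without further sampling, and at its maximum the expected backbone energies with and without data agree up to the grid resolution, as required by equation (1).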



Replica-exchange Monte Carlo 

To estimate the density of states we generate conformations
using an extended replica-exchange Monte Carlo scheme
(REMC). The idea behind REMC is to simulate a system in
parallel at different temperatures [38]. In our scheme, two
temperature parameters control the force field and the likelihood
function independently [39]. We treat $w_{\mathrm{rama}}$ as a third temperature
parameter. Across the first 30 replicas $w_{\mathrm{rama}}$ decreases from
five to zero while the force field and the data are fully taken into
account. In the remaining 50 replicas, the force field and the data
are gradually switched off as described by Habeck et al. [39]. The
convergence of the method depends on the size of the system as
well as the quality and quantity of the data. For the tested
systems, convergence was achieved after 30000 to 100000 replica
transitions corresponding to 2-3 days' worth of computation on an
80-node cluster depending on the size of the system.
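
Treating the backbone weight as a temperature-like parameter leads to the usual Metropolis exchange criterion. The sketch below shows this criterion for two replicas that differ only in their backbone weight. It is not the ISD code: the function names are hypothetical, the even spacing of the 30 weights is an assumption (the text only gives the end points), and only the backbone energies are exchanged here as a stand-in for full conformations.

```python
import math
import random

def swap_acceptance(w_i, w_j, e_i, e_j):
    """Acceptance probability for exchanging the conformations of two replicas
    that differ only in the weight of the backbone potential.

    e_i and e_j are the backbone energies of the conformations currently
    held at weights w_i and w_j.
    """
    log_ratio = (w_i - w_j) * (e_i - e_j)
    return math.exp(min(0.0, log_ratio))

def attempt_swaps(weights, energies, rng):
    """One sweep of neighbour exchanges along the ladder; returns the new energies."""
    energies = list(energies)
    for k in range(len(weights) - 1):
        p = swap_acceptance(weights[k], weights[k + 1], energies[k], energies[k + 1])
        if rng.random() < p:
            energies[k], energies[k + 1] = energies[k + 1], energies[k]
    return energies

# Hypothetical ladder: 30 weights from five down to zero, evenly spaced.
ladder = [5.0 * (29 - k) / 29 for k in range(30)]
rng = random.Random(0)
start = [rng.uniform(150.0, 250.0) for _ in ladder]   # made-up backbone energies
after = attempt_swaps(ladder, start, rng)
```

An exchange that moves the lower-energy conformation to the replica with the larger weight is always accepted; the reverse move is accepted with a probability that decays exponentially in the product of the weight difference and the energy difference.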

Supporting Information 

File S1 Supporting file including supporting text,
Figures S1-S8, and Table S1.

(PDF) 